Multi-lingual Geographical Information Retrieval
Abstract
This paper reports the results of our experiments in the monolingual English, German and Portuguese tasks and in three bilingual tasks: German topics on English collections, English topics on German collections, and English topics on Portuguese collections. Seven official runs were submitted, four for the monolingual task and three for the bilingual task. We used the Terrier (TERabyte RetrIEveR) Information Retrieval Platform version 2.1 to index and query the collections. Experiments for both tasks used the Inverse Document Frequency model with Laplace after-effect and normalization 2. Topics were processed automatically, and the only fields considered were the title and the description; one experiment with the Portuguese collection used the title field only. The stopword list provided by Terrier was used to index all the collections. Precision and recall were low for both the monolingual and bilingual tasks, mainly for the following reasons: 1) no manual processing was done; 2) no query expansion based on automated relevance feedback was applied; 3) no experiments including the narrative field were run; 4) no terms were translated for the bilingual task; 5) the default stopword list was used instead of German and Portuguese stopword lists; and 6) no pre-processing or removal of diacritic marks was performed. We are running new experiments to address some of these issues and to determine their impact on retrieval performance.
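The weighting model named in the abstract (inverse document frequency with Laplace after-effect and normalization 2, commonly abbreviated InL2 in the Divergence From Randomness family) can be sketched for a single query term as below. This is an illustrative sketch, not the authors' code or Terrier's source: the function name, the parameter names, and the default value c=1.0 are assumptions, and the formula follows the usual textbook form of InL2 as we understand it.

```python
import math


def inl2_score(tf, doc_len, avg_doc_len, n_docs, doc_freq, c=1.0, key_freq=1.0):
    """Score one query term in one document with an InL2-style formula.

    All names here are hypothetical; c is the length-normalization
    hyper-parameter, and its default of 1.0 is an assumption.
    """
    if tf <= 0:
        return 0.0
    # Normalization 2: rescale the raw term frequency by document length.
    tfn = tf * math.log2(1.0 + c * avg_doc_len / doc_len)
    # Laplace after-effect: discount the gain from each further occurrence.
    after_effect = 1.0 / (tfn + 1.0)
    # Inverse document frequency component of the randomness model.
    idf = math.log2((n_docs + 1.0) / (doc_freq + 0.5))
    return key_freq * tfn * after_effect * idf


# Example call: a term occurring 3 times in an average-length document.
score = inl2_score(tf=3, doc_len=250, avg_doc_len=250, n_docs=100000, doc_freq=120)
```

A document's score for a query would be the sum of these per-term scores over the query terms it contains. Under this formula, rarer terms (smaller doc_freq) contribute more, and repeated occurrences yield diminishing gains.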
Similar resources
Public Transport Ontology for Passenger Information Retrieval
Passenger information aims at improving the user-friendliness of public transport systems while influencing passenger route choices to satisfy transit user’s travel requirements. The integration of transit information from multiple agencies is a major challenge in implementation of multi-modal passenger information systems. The problem of information sharing is further compounded by the multi-l...
Approaching the Problem of Multi-lingual Information Retrieval and Visualization in Greek and Latin and Old Norse Texts
In this paper, we explore approaches to multi-lingual information retrieval for Greek, Latin, and Old Norse texts. We also describe an information retrieval tool that allows users to formulate Greek, Latin, or Old Norse queries in English and display the results in an innovative clustering and visualization facility.
Experiments in Cross Language Query Focused Multi-Document Summarization
The twin challenges of massive information overload via the web and ubiquitous computers present us with an unavoidable task: developing techniques to handle multilingual information robustly and efficiently, with as high quality performance as possible. Previous research activities on multilingual information access systems have studied cross-language information retrieval (CLIR), information ...
Evaluating Multi-lingual Information Retrieval and Clustering at ULIS
This paper describes our retrieval system for NTCIR-2 Japanese/English CLIR and MLIR tasks. We integrate query and document translation with monolingual retrieval to improve retrieval accuracy, and perform clustering to improve browsing efficiency. We also introduce an entropy-driven technique in evaluating clustering methods.
A multilingual text mining approach to web cross-lingual text retrieval
To enable concept-based cross-lingual text retrieval (CLTR) using multilingual text mining, our approach will first discover the multilingual concept–term relationships from linguistically diverse textual data relevant to a domain. Second, the multilingual concept–term relationships, in turn, are used to discover the conceptual content of the multilingual text, which is either a document contai...